Patent

Attorney's Docket No. 020600-280 
IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 



In re Patent Application of 

Gunter SCHMIDT et al 

Application No.: Unassigned
(Corresponds to PCT/GB98/00130) 



Group Art Unit: Unassigned 



Examiner: Unassigned 



International Filing Date: January 15, 1998 

For: NUCLEIC ACID SEQUENCING 

PRELIMINARY AMENDMENT 

Assistant Commissioner for Patents 
Washington, D.C. 20231



Sir: 



Please amend the application as indicated. 



In the Claims:

Kindly cancel original Claims 1-20 and substitute the following claims therefor: 
—21. A method for sequencing DNA, which comprises: 

(a) obtaining a target DNA population comprising a plurality of single-stranded 
DNAs to be sequenced, each of which is present in a unique amount in the same reaction 
zone and bears a primer to provide a double-stranded portion of the DNA for ligation 
thereto; 

(b) contacting the DNA population with an array of hybridization probes, each 
probe comprising a label cleavably attached to a known base sequence of predetermined 
length, the array containing all possible base sequences of that predetermined length and 
the base sequences being incapable of ligation to each other, wherein the contacting is
carried out in the presence of ligase under conditions to ligate to the double-stranded
portion of each DNA the probe bearing the base sequence complementary to the
single-stranded DNA adjacent the double-stranded portion thereby to form an extended
double-stranded portion which is incapable of ligation to further probes; and

(c) removing all unligated probes; followed by the steps of: 

(d) cleaving the ligated probes to release each label; 

(e) recording the quantity of each label; and 

(f) activating the extended double-stranded portion to enable ligation thereto; 
wherein 

(g) steps (b) to (f) are repeated in a cycle for a sufficient number of times to 
determine the sequence of each single-stranded DNA by determining the sequence of 
release of each label. 



22. A method according to claim 21, wherein the array comprises a plurality of 
sub-arrays which together contain all the possible base sequences, and wherein each sub- 
array is contacted with the DNA population according to step (b), unligated probes are 
removed according to step (c), and these steps are repeated in a cycle before step (d) so that 
all of the sub-arrays contact the DNA population.

23. A method according to claim 21, wherein the target DNA population is
obtained by sorting an initial DNA sample into sub-populations and selecting one of the
sub-populations as the target DNA population.

24. A method according to claim 23, wherein the initial DNA sample is cut into 
fragments, each having a sticky end of known length and unknown sequence, which 
fragments are sorted into sub-populations according to their sticky end sequence. 

25. A method according to claim 21, wherein each single-stranded DNA is
immobilized at one end. 

26. A method according to claim 21, wherein the label of each probe comprises 
a mass label, and the quantity of each label is recorded according to step (e) using mass 
spectrometry after release of the label in step (d). 

27. A method according to claim 21, wherein the known base sequence is 
blocked at its 3'-OH.

28. A method according to claim 27, wherein the step (d) of cleaving the ligated 
probes to release each label unblocks the 3'-OH of the extended double-stranded portion
according to step (f).

29. A method according to claim 28, wherein the label of each probe is
cleavably attached to the 3'-OH of the base sequence.

30. A method according to claim 21, wherein the base sequence of each probe is 
unphosphorylated at both 3' and 5' ends and step (f) comprises phosphorylating the 5'-OH
of the extended double-stranded portion. 

31. A method according to claim 21, wherein the predetermined length of the
base sequence is from 2 to 6. 

32. A method according to claim 31, wherein the predetermined length of the
base sequence is 4. 

33. A kit for sequencing DNA, which comprises an array of hybridization 
probes, each probe comprising a label cleavably attached to a known base sequence of 
predetermined length, the array containing all possible base sequences of that 
predetermined length and the base sequences being incapable of ligating to each other. 

34. A kit according to claim 33, wherein the known base sequence is blocked at 
its 3'-OH.

35. A kit according to claim 34, wherein the label of each probe is cleavably
attached to the 3'-OH of the base sequence to prevent ligation thereto.

36. A kit according to claim 33, wherein the base sequence of each probe is 
unphosphorylated at both 3' and 5' ends.

37. A kit according to claim 33, wherein the label of each probe comprises a 
mass label. 

38. A kit according to claim 33, wherein the predetermined length of the base 
sequence is from 2 to 6. 

39. A kit according to claim 38, wherein the predetermined length of the base 
sequence is 4. 

40. Use of a kit as defined in claim 33 for a method of sequencing DNA.

REMARKS 

The present Amendment is intended to eliminate the use of multiple dependency and 
to add an Abstract to the Specification. 

The examination and allowance of the application respectfully are requested.

Respectfully submitted, 
Burns, Doane, Swecker & Mathis, L.L.P. 



By:
Robin L. Teskin 
Registration No. 35,030 



P.O. Box 1404 

Alexandria, Virginia 22313-1404 
(703) 836-6620 

Date: July 15, 1999 

Applicant or Patentee: Gunter Schmidt et al.

Application or Patent No.: 09/341,641

Filed or Issued: July 15, 1999

For: NUCLEIC ACID SEQUENCING 

VERIFIED STATEMENT (DECLARATION) CLAIMING SMALL ENTITY STATUS 
(37 C.F.R. §§ 1.9(f) AND 1.27(c)) - SMALL BUSINESS CONCERN 

I hereby declare that I am 

[ ] the owner of the small business concern identified below: 

[X] an official of the small business concern empowered to act on behalf of the 
concern identified below: 

NAME OF CONCERN BRAX GROUP LIMITED 

ADDRESS OF CONCERN 13 Station Road

Cambridge, CB1 2JB, GB 

I hereby declare that the above-identified small business concern qualifies as a small business 
concern as defined in 13 C.F.R. § 1.21 for purposes of paying reduced fees under Sections 41(a) 
and 41(b) of Title 35, United States Code, in that the number of employees of the concern,
including those of its affiliates, does not exceed 500 persons. For purposes of this statement, (1)
the number of employees of the business concern is the average, over the previous fiscal year of
the concern, of the persons employed on a full-time, part-time, or temporary basis during each of
the pay periods of the fiscal year, and (2) concerns are affiliates of each other when either, directly 
or indirectly, one concern controls or has the power to control the other, or a third party or parties 
controls or has the power to control both. 

I hereby declare that rights under contract or law have been conveyed to and remain with the small 
business concern identified above with regard to the invention entitled NUCLEIC ACID 
SEQUENCING by inventor(s) Gunter Schmidt and Andrew Hugin Thompson described in

[ ] the specification filed herewith 

[X] Application No. 09/341,641, filed July 15, 1999.

[ ] Patent No. , issued . 



If the rights held by the above-identified small business concern are not exclusive, each individual, 
concern, or organization having rights to the invention is listed below,* and no rights to the 
invention are held by any person, other than the inventor, who would not qualify as an independent 
inventor under 37 C.F.R. § 1.9(c), or by any concern that would not qualify as either a small 
business concern under 37 C.F.R. § 1.9(d) or a nonprofit organization under 37 C.F.R. § 1.9(e).

*NOTE: Separate verified statements are required from each named person, 
concern, or organization having rights to the invention averring to their status as 
small entities. (37 C.F.R. § 1.27.) 

NAME

ADDRESS



NAME 

ADDRESS 



[ ] individual [ ] small business concern [ ] nonprofit organization



[ ] individual [ ] small business concern [ ] nonprofit organization 



I acknowledge the duty to file, in this application or patent, notification of any change in status 
resulting in loss of entitlement to small entity status prior to paying, or at the time of paying, the 
earlier of the issue fee and any maintenance fee due after the date on which status as a small 
entity is no longer appropriate. (37 C.F.R. § 1.28(b).)

I hereby declare that all statements made herein of my own knowledge are true and that all 
statements made on information and belief are believed to be true; and further that these 
statements were made with the knowledge that willful false statements and the like so made are 
punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States 
Code; and that such willful false statements may jeopardize the validity of the application, any 
patent issuing thereon, or any patent to which this verified statement is directed.



NAME OF PERSON SIGNING 




TITLE OF PERSON OTHER THAN OWNER 



ADDRESS OF PERSON SIGNING




SIGNATURE 

NUCLEIC ACID SEQUENCING

The present invention relates to a method for sequencing DNA and
a kit for sequencing DNA.

Traditional methods for sequencing nucleic acid such as DNA
frequently require biological sub-cloning hosts and vectors.
Such traditional methods generally require gel chromatography to
acquire sequence information. These traditional methods are
therefore often complicated multi-stage processes which are both
time-consuming and labour intensive.

PCT/US96/05245 discloses a method of nucleic acid sequencing
based on an iterative process of duplex extension along a
single-stranded template. Duplex extension is effected by ligating
probes to a region of the template primed with an initialising
oligonucleotide. The probes of the above method are labelled,
preferably with a fluorescent dye. The dye identifies a single
base at the ligation site. The probes are prevented from
uncontrolled extension by having removable blocking groups at one
of their termini.

PCT/GB95/00109 suggests a method of nucleic acid sequencing
comprising sequentially extending a primer a predetermined
number of bases at a time, the added bases being complementary
to the bases being sequenced. This is achieved by contacting the
nucleic acid with a labelled adaptor, the label being specific
to the base sequence of the adaptor. A population of adaptors
is used having oligonucleotide sequences including all possible
permutations for a predetermined number of bases.

The present invention provides a method for sequencing DNA, which 
comprises:

(a) obtaining a target DNA population comprising one or
more single-stranded DNAs to be sequenced, each of which is
present in a unique amount and bears a primer to provide a
double-stranded portion of the DNA for ligation thereto;

(b) contacting the DNA population with an array of
hybridisation probes, each probe comprising a label cleavably
attached to a known base sequence of predetermined length, the
array containing all possible base sequences of that
predetermined length and the base sequences being incapable of
ligation to each other, wherein the contacting is carried out
in the presence of ligase under conditions to ligate to the
double-stranded portion of each DNA the probe bearing the base
sequence complementary to the single-stranded DNA adjacent the
double-stranded portion thereby to form an extended
double-stranded portion which is incapable of ligation to further
probes; and

(c) removing all unligated probes; followed by the steps of:

(d) cleaving the ligated probes to release each label; 

(e) recording the quantity of each label; and 

(f) activating the extended double-stranded portion to
enable ligation thereto; wherein

(g) steps (b) to (f) are repeated in a cycle for a 
sufficient number of times to determine the sequence of the or
each single-stranded DNA by determining the sequence of release
of each label. 

In one embodiment the array comprises a plurality of sub-arrays 
which together contain all the possible base sequences, and 
wherein each sub-array is contacted with the DNA population 
according to step (b), unligated probes are removed according to
step (c), and these steps are repeated in a cycle before step (d)
so that all of the sub-arrays contact the DNA population. In
this way, the array of hybridisation probes is presented to the
DNA population in stages. For example, where the predetermined
length of base sequence is 4 and the total number of possible
base sequences is 256 (4^4), cross-hybridisation between
complementary 4-mers in the array can be avoided by contacting the
DNA population with a first sub-array of 128 probes and, after
removing all unligated probes, contacting with a second sub-array
of 128 probes.
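
By way of illustration only, the division of the 256 possible 4-mers into two sub-arrays of 128 probes can be sketched as follows. The sketch assumes that "complementary" probes are reverse complements of one another (antiparallel pairing), and the function names are hypothetical. The 16 self-complementary 4-mers cannot be separated from themselves, so the sketch simply shares them equally between the two sub-arrays.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def split_into_sub_arrays(length: int = 4):
    """Split all probes of the given length into two sub-arrays.

    No probe shares a sub-array with a different probe that is its
    reverse complement. Self-complementary probes are shared out evenly.
    """
    first, second, self_complementary = [], [], []
    seen = set()
    for bases in product("ACGT", repeat=length):
        probe = "".join(bases)
        if probe in seen:
            continue
        partner = reverse_complement(probe)
        seen.update((probe, partner))
        if partner == probe:
            self_complementary.append(probe)
        else:
            first.append(probe)     # one member of the pair
            second.append(partner)  # its complement goes to the other sub-array
    half = len(self_complementary) // 2
    first.extend(self_complementary[:half])
    second.extend(self_complementary[half:])
    return first, second


first, second = split_into_sub_arrays(4)
print(len(first), len(second))
```

For a predetermined length of 4 this yields two sub-arrays of 128 probes each, consistent with the example given above.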

The target DNA population may be obtained by sorting an initial
DNA sample into sub-populations and selecting one of the
sub-populations as the target DNA population. Thus, if the initial
DNA sample is large its size can be reduced by the sorting step.
In a preferred arrangement, the initial DNA sample is cut into
fragments, each having a sticky end of known length and unknown
sequence, typically a length of from 2 to 6, preferably about 4
bases. The fragments may be sorted into sub-populations
according to their sticky end sequence. It is thought that a
population or sub-population of at least 60 fragments can be
sequenced in parallel with an acceptable error rate using a probe
with a base sequence of 4 bases.

Preferably, each single-stranded DNA is immobilised, usually at
one end, for example on a solid support such as a bead. This has
the advantage that removal of unwanted material can take place
in solution and separation of the labels from the probes is
facilitated. Preferably, the target DNA is immobilised prior to
step (b) on the solid phase support. The solid phase support may 
conveniently be attached to the primer. 

The label may be any suitable label such as a fluorescent label, 
a radio label or a mass label. The identity of the label must 
be assignable to the respective base sequence so that 
identification of the label identifies the base sequence. In a 
preferred arrangement, the label of each probe comprises a mass 
label. Each mass label is uniquely identifiable in relation to 
every other mass label using a mass spectrometer. Typically each 
mass label has a distinct mass from every other mass label and 
preferably a single ionization state at the pH of analysis in a 
mass spectrometer. Each mass label preferably does not fragment 
in the mass spectrometer. Preferred mass labels do not interfere 
with the action of the ligase in the sequencing method or with 
any other of the molecular biology steps used in the invention. 



Where the label is a mass label, the quantity of each label 
corresponding to the ligated hybridisation probe is recorded in 
step (e) after release of the label in step (d). Where the label
is a fluorescent label, step (e) may precede step (d) and the 
quantity of fluorescent label present on the ligated probe is 
recorded before the label is released. 



In any one cycle of the method according to the invention it is 
essential that the base sequence of only one probe ligates to the 
double-stranded portion of each DNA. The base sequences of the
probes of the array are therefore incapable of ligation to each 
other so that the extended double-stranded portion which is 
formed after ligation is incapable of ligation to further probes. 
In subsequent step (f), the extended double-stranded portion is
activated to enable ligation thereto of a further probe in the 
next cycle. The base sequences may be incapable of ligation to 
each other either by requiring activation or by being blocked to 
prevent ligation thereto. 

In one embodiment of the invention the known base sequence is 
blocked at its 3'-OH. According to this embodiment, primer
extension sequencing takes place in the 5' to 3' direction. In 
another embodiment of the invention, the base sequences are 
capable of ligating to each other only when activated by 
phosphorylation. According to this embodiment, the base sequence 
of each probe is unphosphorylated at both 3' and 5' ends and 
activation step (f) comprises phosphorylating the 5'-OH of the
extended double-stranded portion to enable ligation thereto. 

Advantageously, the step (d) of cleaving the ligated probes to 
release each label unblocks the 3'-OH of the extended
double-stranded portion according to step (f). In other words, step (d)
and step (f) are one and the same. Preferably, the label of each
probe is cleavably attached to the 3'-OH of the base sequence.
Thus, cleaving the label from the probe unblocks the 3'-OH so as
to allow a new hybridisation probe to ligate thereto in the next 
sequencing cycle. 



Theoretically the predetermined length of the base sequence is 
limited only by considerations of ligase fidelity. The longer
the base sequence, the stronger the hybridisation will be between
probe base sequence and single-stranded DNA. Thus, a length of
10 or 11 is thought to be about the maximum before ligase
fidelity becomes unacceptable. However, practically speaking,
sequences of this length would require too many unique labels to
be useful, whereas shorter base sequences require fewer unique
labels. Preferably, the predetermined length of the base
sequence is from 2 to 6, more preferably 4.
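
The dependence of the number of unique labels on the predetermined length can be illustrated by a short calculation. This is a sketch only: it assumes one distinct label per possible base sequence, and the function name is hypothetical.

```python
def labels_required(probe_length: int) -> int:
    """Number of distinct base sequences (and hence labels) of a given length."""
    return 4 ** probe_length


for n in (2, 4, 6, 10):
    print(n, labels_required(n))
```

On this assumption a length of 4 requires 256 labels, whereas a length of 10 would require over one million.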

The invention further provides a kit for sequencing DNA, which 
comprises an array of hybridisation probes, each probe comprising
a label cleavably attached to a known base sequence of
predetermined length, the array containing all possible base 
sequences of that predetermined length and the base sequences 
being incapable of ligation to each other. The array of 
hybridisation probes is preferably as defined above. The kit may 
further comprise instructions for use in a method of sequencing 
DNA. Use of the kit is therefore provided for a method of 
sequencing DNA, especially the method described above. 

The invention will now be described in further detail by way of
example only, with reference to the accompanying drawings, in
which:

FIGURES 1a and 1b show respectively first and second cycles of
a preferred process according to the invention; 

FIGURES 2a and 2b show respectively first and second cycles of
an alternative process according to the invention;

FIGURE 3 shows typical adaptor molecules for use in the
invention;

FIGURE 4 shows a preferred method of producing target DNA for 
sequencing in accordance with the invention; 

FIGURE 5 shows a bar chart depicting the data for the first cycle 
of a 4-mer sequencing experiment, series 1 being without ligase
and series 2 with ligase;

FIGURE 6 shows a bar chart depicting the data for the second
cycle of the 4-mer sequencing experiment, series 1 being without
ligase and series 2 with ligase; and

FIGURE 7 shows a bar chart depicting the data for the third cycle
of the 4-mer sequencing experiment, series 1 being without ligase
and series 2 with ligase.

Parallel sequencing of sorted populations of nucleic acids by 
primer extension sequencing: 

This invention is a process that allows a heterogenous population 
of nucleic acid fragments, generated by various means, to be 
sequenced simultaneously. The process provides a novel strategy 
for sequencing genomic DNA that potentially could avoid the need 
for biological subcloning hosts and vectors.

The sequencing process described here allows one to produce
nucleic acid fragment populations in a reproducible manner that
can then be sorted into subsets and finally sequenced by an
iterative process of ligation of probes to an immobilised
single-stranded DNA molecule.

Outline of sequencing process: generation of a mixed nucleic
acid population; sorting of molecules into subsets; simultaneous
sequencing of molecules within subsets.



The sequencing steps use short single-stranded oligonucleotides
of a predetermined length to probe the sequence of
single-stranded immobilised template nucleic acid fragments.
Single-stranded regions adjacent to a primed region are determined by
ligating the probe oligonucleotides to the primer and determining
their identity on the basis of a label carried by the
oligonucleotides. The label identifies the sequence of the
oligonucleotide probes. Nucleic acid fragments are probed as 
heterogenous sets and sequence information is determined by 
measuring the quantity of label of correctly hybridised and 
ligated probes. 

Sequencing can be performed in either a 5' to 3' format or in a 
3' to 5' format. Uncontrolled extension in the 5' to 3' format 
is prevented by reversibly blocking the 3'-OH at the terminus
of the probes prior to addition of the probes to the extending 
primer. After ligation of probe to primer any unligated probe is 
washed away. The quantity of ligated probes is determined and the 
3' terminus is unblocked to allow the next cycle of probing to 
be performed on the extended primer. In the 3' to 5' format, 
uncontrolled extension of the primer is prevented by using a
phosphorylation step to add a triphosphate entity onto the
extending primer's 5'-OH group. Probes are synthesised without
any phosphate groups at the 5' terminus, so after each addition
of probes to the extending primer, the 5'-OH must be
phosphorylated to permit further extension. 

The sequence of individual fragments is determined by comparing 
quantities of label for each type of probe in each cycle of the 
sequencing process with quantities derived in previous and 
subsequent cycles. The invention provides a method for analysing 
heterogenous sub-populations of nucleic acids without spatially 
resolving them. This is achieved by a signal acquisition and
signal processing procedure that allows sequences to be 
identified on the basis of their relative quantities. 
This process does not require traditional gel methods to acquire 
sequence information. Since the entire process takes place in
solution and is an iterative process, the steps involved could 
be performed by a liquid-handling robot. 
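
One possible form of such signal processing is sketched below, by way of example only. The sketch makes simplifying assumptions that are not taken from the description: the amount of each template is already known, the recorded quantity of a label equals the sum of the amounts of the templates that ligated the corresponding probe, and the amounts are chosen so that every combination of templates gives a distinct sum. All names are hypothetical.

```python
from itertools import combinations


def templates_for_quantity(quantity, amounts, tol=1e-6):
    """Return the templates whose amounts add up to a recorded label quantity."""
    names = list(amounts)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            if abs(sum(amounts[n] for n in subset) - quantity) <= tol:
                return subset
    raise ValueError(f"no combination of templates explains quantity {quantity}")


def deconvolve(cycles, amounts):
    """Rebuild the succession of ligated probes for each template.

    cycles  -- list of dicts, one per cycle, mapping probe sequence to quantity
    amounts -- dict mapping template name to its (assumed known) unique amount
    """
    reads = {name: "" for name in amounts}
    for cycle in cycles:
        for probe, quantity in cycle.items():
            for name in templates_for_quantity(quantity, amounts):
                reads[name] += probe
    return reads


# Illustrative data: three templates present in amounts 1, 2 and 4.
amounts = {"X": 1.0, "Y": 2.0, "Z": 4.0}
cycles = [
    {"ACGT": 1.0, "TTGA": 6.0},  # Y and Z ligate the same probe in cycle 1
    {"GGCA": 3.0, "CATG": 4.0},  # X and Y ligate the same probe in cycle 2
]
reads = deconvolve(cycles, amounts)
print(reads)
```

Each read is the succession of probes ligated to one template, and is therefore complementary to that template. Measurement noise and unknown amounts, which a practical procedure would have to handle, are outside the scope of this sketch.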

Sequencing large nucleic acid molecules: 

It is not necessary to sequence an entire molecule at once to 
determine its sequence, which is fortunate as it is a practical 
impossibility, at the moment, to sequence molecules as large as 
chromosomes. It is calculated that any given sequence 17 bp long 
should be unique within the human genome. Similar calculations 
can be performed for genomes that are of different sizes. This 
consideration means that large nucleic acids or entire genomes 
can be sequenced by degradation into short overlapping fragments, 
> 17 bp in length, which can then be sequenced and the total 
genome sequence can thence be reconstructed using software to 
determine contig overlaps. 
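
The description does not state how the figure of 17 bp was calculated. One simple estimate, given here as an assumption, treats the genome as a random sequence of about 3 x 10^9 bp, counts both strands, and seeks the shortest length k for which the number of possible sequences, 4^k, exceeds the number of positions at which a sequence can start. The function name and the genome size are illustrative.

```python
def min_unique_length(genome_bp: int, strands: int = 2) -> int:
    """Smallest k for which 4**k exceeds the number of k-mer start sites."""
    sites = genome_bp * strands
    k = 1
    while 4 ** k <= sites:
        k += 1
    return k


print(min_unique_length(3_000_000_000))
```

Under these assumptions the estimate is 17, and the same function can be applied to genomes of other sizes.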

Preparing a Nucleic Acid for Sequencing:

To sequence a complete nucleic acid of significant size is 
practically very difficult. This process requires fragmentation 
of the target nucleic acid and sorting into sub-populations that 
are small enough to allow simultaneous sequencing. Various 
embodiments of the sorting process have been described previously 
in the Gene Profiling patent application and the prior sequencing 
application. Only a minor variation in the use of adaptors to 
provide distinct termini in a population of generic nucleic acids 
is discussed here. 

Immobilising a specific terminus in a population of nucleic
acids:

An important factor is immobilisation of nucleic acids at one
terminus. This requires that an arbitrarily generated fragment
have directionality, i.e. it requires two distinguishable
termini. This can be achieved using adaptors. Two types of
adaptors are required to identify two distinct termini. Exemplary
adaptors are shown in the attached figures. Adaptor 1 provides
immobilisation and the recognition site for a type II
restriction endonuclease that generates blunt-ended fragments;
in this example the enzyme chosen is BsuRI, which is methylation
sensitive. DNA to be sequenced would be synthesised with
5-methyl cytosine while adaptors would be synthesised with
unmethylated cytosine so that only adaptors would be sensitive
to cleavage by BsuRI. Adaptor 2 provides a type IIs restriction
endonuclease recognition site or alternatively a restriction
site for a second ordinary type II restriction endonuclease.

The adaptors need to be attached to the nucleic acid fragments. 
Effecting attachment depends on the means used to fragment the 
population, but assuming random fragmentation with some form of 
nuclease that generates known sticky-ends, ligation of forms of 
both adaptor types bearing complementary sequences will be 
effective or blunt-ended adaptors could be used as shown in
Figure 3. This generates fragments of three types: fragments with 
both ends carrying adaptor 1, fragments with both ends carrying 
adaptor 2 and thirdly fragments carrying adaptor 1 at one end and
adaptor 2 at the other. Statistically the third type of fragment
will be in the majority. If the immobilisation effector on
adaptor 1 is biotin then the fragments carrying adaptor 1 can be
immobilised on a solid phase matrix derivatised with avidin. The
fragments carrying adaptor 2 at both ends can be washed away.
Those fragments carrying two immobilisation adaptors might be
immobilised at both termini depending on the fragment lengths.
Cleavage with the type IIs restriction endonuclease whose binding
site is carried by adaptor 2 will generate ambiguous sticky-ends
at one terminus of the fragments bearing both types of adaptor.
The fragments bearing two type 1 adaptors will be unchanged. The
cleaved adaptor fragments can then be washed away with the type
IIs restriction endonuclease. A second cleavage with the ordinary
type II restriction endonuclease whose cleavage site is in
adaptor 1 will release the remaining immobilised fragments that
bore one copy of each adaptor at their termini. Those fragments
should have an ambiguous sticky-end at the terminus that bore
adaptor 2 and can thus be sorted as described below. Those
fragments that carried two copies of adaptor 1 will have
blunt-ended termini and will not bind the array and can thus be washed
away. In this way a population of nucleic acid fragments can be
specifically immobilised at one terminus with the other terminus
prepared for sequencing. As long as multiple copies of each
sequence are present then statistically the vast majority of
sequences should be represented in the part of the population
carrying both adaptors and thus every sequence should be
sequenced at least once. Any gaps should become apparent in the
contig reconstruction process and can then be specifically
searched for using primers targeted at sequences flanking the
gaps.

Alternatively sorting can be left until a later step if adaptor
2 bore a cleavage site for an ordinary type II restriction
endonuclease that generated a known sticky-end. Preferably a
methylation sensitive restriction enzyme would be required to do
this. The resultant fragments can then be immobilised on beads
for further processing such as further amplification or in order
to render the fragments single-stranded. One skilled in the art
could almost certainly think of other methods of achieving
distinct termini. Furthermore, if a restriction map for the 
target DNA is known then designing adaptors or protocols to 
distinguish the termini of fragments is simpler. 

Generating single-stranded DNA for primer extension sequencing:

This sequencing system requires single-stranded DNA fragments to
operate on. These are relatively straightforward to generate. One need only
use beads derivatised with a double-stranded oligonucleotide that
has no terminal phosphate groups on its exposed 5' strand.
Cleavage of the DNA fragments to be sequenced with an enzyme that 
leaves 5' phosphates or use of a kinase to generate 5' phosphate 
groups on these fragments is required so that ligation of these 
fragments to the beads can take place, see Figure 4. The 
ligation will leave the strand linked to the 5' terminus of the 
immobilised oligonucleotide with a nick. Raising the temperature
or otherwise producing denaturing conditions will remove the 
nicked strand, leaving an immobilised single stranded DNA. 

DNA from phage M13 is single-stranded and this is often used as
a sequencing vector to generate single-stranded templates for
Sanger sequencing.

Sorting molecules into subsets: 

Once a fragment population has been amplified and distinct 
termini established for each fragment, as described above, the
fragments with ambiguous sticky-ends can be sorted. Sorting can
be effected in the same way as described in the Gene Profiling
application GB 9618544.2 using beads derivatised with
oligonucleotides complementary to the possible sticky-ends that 
might be generated. The sorting process can be repeated with the 
first sorted populations using adaptors to provide another 
terminal type IIs restriction endonuclease site. This will allow
another set of ambiguous sticky-ends to be generated allowing 
further sub-sorting until the nucleic acid fragment population 
is of the correct size for unambiguous sequence determination. 

One can also effect sorting with oligonucleotide chips, allowing
simultaneous analysis of fragments. This is particularly 
desirable as the quantities of reagents required would be much 
smaller than for a series of wells. This sorting method is 
compatible with fluorescence as a means of detection. A 
population of DNA fragments with an ambiguous sticky-end at one 
terminus can be sorted on an oligonucleotide chip by ligation of 
the exposed sticky-end to its complement. Thus for a 4 bp 
sticky-end, a chip with the 256 possible 4-mers present at
discrete locations on its surface would be required.



This sorting process above generates, for a 4 bp ambiguous 
sticky-end, 256 sub-populations. This may generate nucleic acid 
populations small enough to begin sequencing or further sub- 
sorting may be necessary. 
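
The sorting step can be pictured, purely as a sketch, as grouping fragments by the sequence of their sticky-end. In the following illustration each fragment is represented, by assumption, as a string whose first four bases are the single-stranded overhang. The function name and the example fragments are hypothetical.

```python
from collections import defaultdict


def sort_by_sticky_end(fragments, overhang=4):
    """Group fragments by the first `overhang` bases of their free terminus."""
    bins = defaultdict(list)
    for fragment in fragments:
        bins[fragment[:overhang]].append(fragment)
    return dict(bins)


fragments = ["ACGTTTGACC", "ACGTGGATCA", "TTAGCCATGA"]
bins = sort_by_sticky_end(fragments)
print({end: len(members) for end, members in bins.items()})
```

With a 4 bp sticky-end there are at most 4^4 = 256 such groups, each corresponding to one sub-population.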

Primer Extension and Parallel Sequencing of Heterogenous 
Populations of Nucleic Acid Fragments: 

Sequencing a single molecule by ligation of single-stranded
oligonucleotides to a primer:

This process can be understood first by explaining it for the 
case of a single nucleic acid. Consider a single nucleic acid, 
immobilised at one terminus to a fixed insoluble matrix. This 
molecule is rendered single stranded, except for a short stretch 
of double-stranded DNA at the immobilised terminus of the
molecule. This primer sequence could be provided by the adaptor 
used to immobilise the terminus. 



To determine the sequence of this single-stranded molecule one
can probe the immobilised nucleic acid with every one of the
256 possible single-stranded 4 base oligonucleotides. Each of
these would carry a unique identifying label corresponding to its
known sequence of 4 bp. In the 5' to 3' format (see Figures 2a
and b), the label could be attached to the 3'-OH, effectively
blocking the probes from further extension, or a separate blocking
group can be used and the label can be attached elsewhere in the
molecule. In the 3' to 5' format (see Figures 1a and b) there is
no particular advantage in attaching the mass label to any
particular part of the probe, except that it is less likely to
interfere with the ligase if it is added to the terminus of the
probe.



If the oligonucleotides are added in the presence of a ligase,
the oligonucleotide complementary to the 4 bases of sequence
adjacent to the primed double-stranded region will be ligated
to the primer. The immobilised matrix can then be washed to
remove any unbound oligonucleotides. To determine the sequence
of the 4 base oligonucleotide that ligated to the primer, one
need only analyse the label attached to the 3' end of the
oligonucleotide. The labelling system for use with this invention
is described in a PCT patent application filed concurrently with
the present application (Page White & Farrer Ref: 86359). This
describes 'mass labelling' in which the mass of the label
identifies its carrier. Such labels can be made photolabile or
cleavable by a specific agent. Cleavage of the label will release
it into solution, from which it can be injected into an
electrospray mass spectrometer for analysis, which will determine
the sequence of the oligonucleotide and, furthermore, its quantity.



In the preferred embodiment, a photolysable linker would connect
the mass label to the 3'-OH which when cleaved would regenerate
the 3'-OH with as high an efficiency as possible. The primer has
then been extended by 4 known bases and the cycle can be repeated
to determine the next 4 bp of sequence. This process can be
repeated iteratively until the entire molecule has been
sequenced.

An alternative implementation to using photolysable mass labels
at the 3'-OH of each 4-mer oligonucleotide would be to cap the
3'-OH with a phosphate group. The mass-label could be attached
to another part of the molecule from which it can be released
independently of the uncapping reaction of the 3' terminus.
Uncapping of the 3' terminus can be effected by washing the
immobilised DNA with alkaline phosphatase, which will readily
remove the capping phosphate from the 3'-OH, leaving it available
for the next cycle of the sequencing process.

Conceivably this system could be implemented with other labelling
schemes, but most other labelling schemes do not generate
sufficient unique labels to be practical. Using fluorescence the
same system could be implemented, but since only 4 good dyes are
commercially available, the 4 bp oligonucleotides would have to
be tested in 64 groups of 4, rather than all at once. Similar
considerations apply to the use of radiolabels, but here each oligo
would be added one at a time. Other labels include carbohydrates
and biotin, amongst others.

In practice, mass-labelled oligonucleotides would probably be added
in two sets of 128 such that each member in the first set would
have its complement in the other set. This overcomes the problem
of cross-hybridisation between complementary 4-mers.
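
One way such a split could be computed is sketched below. It
assumes that 'complement' means the reverse complement, i.e. the
4-mer that would hybridise in antiparallel orientation; the function
names are illustrative only. Under that assumption some 4-mers (for
example ACGT) are their own complement, so the sketch reports them
separately rather than guessing how they would be handled.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the antiparallel complementary sequence, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def split_probe_sets(k=4):
    """Split all k-mers so that no set contains a probe and its complement."""
    set_a, set_b, self_complementary = [], [], []
    seen = set()
    for bases in product("ACGT", repeat=k):
        probe = "".join(bases)
        if probe in seen:
            continue
        partner = reverse_complement(probe)
        seen.update((probe, partner))
        if partner == probe:
            # A probe that pairs with itself cannot be separated by a split.
            self_complementary.append(probe)
        else:
            set_a.append(probe)
            set_b.append(partner)
    return set_a, set_b, self_complementary


set_a, set_b, palindromes = split_probe_sets(4)
print(len(set_a), len(set_b), len(palindromes))
```

If only the base-by-base complement in the same orientation were
meant, no 4-mer would be its own complement and two equal sets of
128 would result.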

Sequencing a Population of Nucleic Acid Fragments:

The same process can be applied to a heterogeneous population of
immobilised nucleic acids allowing them to be analysed in
parallel. To be successful when applied to a population of
nucleic acids, this method relies on the assumption that
statistically 1 out of 256 molecules within the total population
will carry each of the possible 4 bp sequences adjacent to the
double stranded primer region. If one sub-sorts one's nucleic
acid population into manageable subsets of less than 256
fragments, one would expect that almost all will have different
ambiguous sticky-ends (there is about a 1 in 1000 chance of there
being 2 distinct DNAs having the same 4 bp sequence at any given
point if 100 distinct sequences are analysed simultaneously) so
for most purposes one can assume that a hybridisation signal
corresponds to a single DNA type. This all assumes that DNA
sequences are random sequences of bases, which is not strictly
true but is a sufficient assumption for the purposes of this
invention. Obviously 1 in 1000 is not a small probability and
sequences will often have the same 4-mer in a sequencing cycle.
However this invention includes an algorithm that can resolve to
a great extent any possible ambiguities caused by this
occurrence.
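
How often shared 4-mers arise under the random-sequence assumption
can be explored with a standard birthday-style calculation. The
sketch below is an illustration only and is not the calculation
used in the text; the function names are assumptions. It separates
the chance that one particular pair of DNAs shares a 4-mer from the
chance that any two DNAs in the population do, which rises quickly
with population size.

```python
def pair_share_probability(k=4):
    """Chance that two given random sequences show the same k-mer."""
    return 1 / 4 ** k


def any_share_probability(n, k=4):
    """Chance that at least two of n random sequences show the same k-mer."""
    outcomes = 4 ** k
    if n > outcomes:
        return 1.0
    p_all_distinct = 1.0
    for i in range(n):
        # The i-th sequence must avoid the i k-mers already taken.
        p_all_distinct *= (outcomes - i) / outcomes
    return 1 - p_all_distinct


for population in (2, 10, 100):
    print(population, any_share_probability(population))
```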

Reconstructing Sequences of Target Nucleic Acids:

Repetitions of the primer extension cycle will generate a matrix
of quantities of label corresponding to each possible probe.
Shown below is a possible matrix for all probes of 4 base pairs
in length:





Sequence to which     Cycle 1   Cycle 2   Cycle 3   Cycle 4
label corresponds

AAAA                     5        24        13         7
AAAC                    10         5         9        13
AAAG                    13         9        15        17
...
TTTG                     7        13        17        10
TTTT                    17        10         7        14


To reconstruct the sequences to which these quantities of label
correspond, this invention may incorporate an algorithm for
analysing such a data matrix. Such an algorithm and a computer
program for employing the algorithm are described in detail in
PCT/GB97/02734. The algorithm attempts to identify a sequence
on the basis of its frequency, i.e. a sequence present at a given
frequency will have every subsequence present at the same
frequency. The algorithm searches through each column of the
matrix and attempts to resolve label quantities, which may be sums
of sequence frequencies, into atomic quantities such that the same
set of atomic quantities appears in all columns. The algorithm
achieves this by comparing label quantities in a given column
with those in the previous and the subsequent columns, except in
the case of the first and last columns, which can only be compared
with the following and previous columns respectively. A given
atomic quantity that appears in all columns is then assumed to
correspond to a unique sequence.
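
The simplest case of this idea, in which every sequence has a
distinct quantity and no two sequences share a probe in any cycle,
can be sketched as follows. This is an illustration under those
stated assumptions and is not the algorithm of PCT/GB97/02734; the
function name and the toy quantities are hypothetical. The returned
strings are the ligated probes in cycle order, i.e. the strand
complementary to the template.

```python
def reconstruct_unique(matrix):
    """matrix: one dict per cycle mapping probe sequence -> label quantity."""
    per_cycle = []
    for column in matrix:
        by_quantity = {}
        for probe, quantity in column.items():
            by_quantity.setdefault(quantity, []).append(probe)
        per_cycle.append(by_quantity)

    sequences = {}
    for quantity in per_cycle[0]:
        probes = []
        for by_quantity in per_cycle:
            hits = by_quantity.get(quantity, [])
            if len(hits) != 1:
                break  # quantity missing or ambiguous in this cycle
            probes.append(hits[0])
        else:
            # The quantity appeared exactly once in every column.
            sequences[quantity] = "".join(probes)
    return sequences


# Toy matrix with made-up quantities for two sequences over two cycles.
toy = [{"TGTC": 5, "AAAC": 10}, {"TACT": 5, "GGGA": 10}]
print(reconstruct_unique(toy))
```

Quantities that are missing or duplicated in some column are simply
left unresolved here; handling them is where comparison with
neighbouring columns becomes necessary.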

If two sequences have the same n-mer at a particular point in the 
sequence, these can be resolved by the quantitative nature of 
this system in that the quantity of a particular n-mer in a 
particular ligation will be the sum of the quantities of the two 
sequences that share the n-mer at the same point. These can be 
largely resolved by comparison of one cycle with previous and 
subsequent ligation cycles to identify such sums. This is made 
particularly simple if the sequences that are being analysed have
been amplified by PCR such that the sequence in the lowest
quantity is present at not less than half the quantity of the
sequence with the greatest frequency, that is to say if the
frequency range of sequences lies between some quantity N and 2N.
This means that any sum of frequencies will be at least 2N
and hence readily detectable.
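
A minimal sketch of how such sums might be flagged and decomposed
is given below. It assumes the atomic quantities are already known
from neighbouring cycles, are all distinct, and are all below 2N;
the function name and the numbers in the usage line are
hypothetical.

```python
from itertools import combinations


def explain_as_sums(column, atomic_quantities, n_min):
    """Find label quantities in one cycle that look like sums of two sequences.

    column: dict mapping probe -> label quantity for a single cycle.
    atomic_quantities: quantities of individual sequences from other cycles.
    n_min: the lower bound N of the frequency range.
    """
    resolved = {}
    for probe, quantity in column.items():
        if quantity < 2 * n_min:
            continue  # too small to be the sum of two sequences
        pairs = [
            (a, b)
            for a, b in combinations(sorted(atomic_quantities), 2)
            if a + b == quantity
        ]
        resolved[probe] = pairs
    return resolved


print(explain_as_sums({"ACGT": 23, "TTTT": 17}, {10, 13, 17}, n_min=10))
```

A probe that maps to more than one candidate pair remains ambiguous
and would need comparison with further cycles.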

There may be occasional ambiguities that only give partial
resolution of the sequences. Further resolution can be obtained
by performing the same sequencing process for each sample twice.
In each case the length of the probe is different, so for the
first sequencing attempt, probes of 4 base pairs would be used
and for the second, probes of 5 base pairs would be used.
Comparison of the two matrices will allow the sequences to be
resolved with far fewer ambiguities.

Implementation of the invention:

Practical details of implementing the process are described 
below . 



Adaptors, PCR Primers and Oligonucleotides: 

Construction of Oligonucleotides, Adaptors, Primers, etc: 

Details and reviews on the construction of oligonucleotides are
available in numerous up to date texts, which should allow one
skilled in the art to construct primers, adaptors and any other
oligonucleotides required by the invention:

• Gait, M.J. editor, 'Oligonucleotide Synthesis: A Practical
Approach', IRL Press, Oxford, 1990

• Eckstein, editor, 'Oligonucleotides and Analogues: A Practical
Approach', IRL Press, Oxford, 1991

• Kricka, editor, 'Nonisotopic DNA Probe Techniques', Academic
Press, San Diego, 1992

• Haugland, 'Handbook of Fluorescent Probes and Research
Chemicals', Molecular Probes, Inc., Eugene, 1992

• Keller and Manak, 'DNA Probes, 2nd Edition', Stockton Press,
New York, 1993

• Kessler, editor, 'Nonradioactive Labeling and Detection of
Biomolecules', Springer-Verlag, Berlin, 1992.

Of particular importance is the chemistry used to cap the 3'-OH
of the probe oligonucleotides. Acid labile and base labile groups
are well known and discussed in the texts above. Capping with a
phosphate group is also possible using the above texts; such a
group can then be controllably removed using a phosphatase such
as alkaline phosphatase, which is readily available.



Conditions for Using Oligonucleotide Constructs:

Details on effects of hybridisation conditions for nucleic acid
probes can be found in the references below:

• Wetmur, Critical Reviews in Biochemistry and Molecular Biology, 
26, 227-259, 1991 

• Sambrook et al, 'Molecular Cloning: A Laboratory Manual, 2nd
Edition', Cold Spring Harbor Laboratory, New York, 1989

• Hames, B.D., Higgins, S.J., 'Nucleic Acid Hybridisation: A
Practical Approach', IRL Press, Oxford, 1988

Ligation:

Ligation of oligonucleotides is a critical aspect of the
invention that must be considered. Chemical methods of ligation
are known:

• Ferris et al, Nucleosides and Nucleotides 8, 407-414, 1989

• Shabarova et al, Nucleic Acids Research 19, 4247-4251, 1991

Preferably enzymatic ligation would be used as this has much
higher fidelity. Preferred ligases would be T4 DNA ligase, T7 DNA
ligase, E. coli DNA ligase, Taq ligase, Pfu ligase and Tth
ligase. References to the literature are given below:

• Lehman, Science 186, 790-797, 1974

• Engler et al, 'DNA Ligases', pg 3-30 in Boyer, editor, 'The
Enzymes, Vol 15B', Academic Press, New York, 1982

Protocols for use of ligases can be found in: 

• Sambrook et al, cited above 

• Barany, PCR Methods and Applications, 1: 5-16, 1991

• Marsh et al, Strategies 5, 73 - 76, 1992 



Phosphorylation of Nucleic Acids:

When ligases and restriction endonucleases are used, there are
changes made to the 5' phosphates of nucleic acid backbone sugar
molecules. It is critical to this invention that extension of
primers by ligated oligonucleotides be tightly controlled such
that only one oligonucleotide is ligated to each extending primer
in each cycle of the sequencing process. It is also possible to
alter the phosphorylation state of oligonucleotides, adaptors or
target nucleic acids during their synthesis or later, in versions
of the process. Included are references to literature regarding
use of phosphatases, kinases and chemical methods:

• Horn and Urdea, Tetrahedron Lett. 27, 4705, 1986

• Sambrook et al, cited above

The 5'-hydroxyl group of the oligonucleotides can be chemically
phosphorylated by means of phosphoryl chloride (POCl3).

Restriction Endonucleases:

Numerous type II and IIs restriction endonucleases exist and
could be used with this invention. Table 1 below gives a list of
examples but is by no means comprehensive. A literature review of
restriction endonucleases can be found in Roberts, R.J., Nucl.
Acids Res. 18, 2351-2365, 1988. New enzymes are discovered at
an increasing rate and more up to date listings are recorded in
specialist databases such as REBase, which is readily accessible
on the internet using software packages such as Netscape or
Mosaic and is found at the World Wide Web address:
http://www.neb.com/rebase/. REBase lists all restriction enzymes
as they are discovered and is updated regularly; moreover it
lists recognition sequences, isoschizomers of each enzyme,
manufacturers and suppliers and references to them in the
scientific literature. The protocol would be much the same
irrespective of the type IIs restriction endonuclease used but the
spacing of recognition sites for a given enzyme within an adaptor
would be tailored according to requirements and the enzyme's
cutting behaviour (see figure n above).

Enzyme Name     Recognition sequence     Cutting site

FokI            GGATG                    9/13
BstF5I          GGATG                    2/0
SfaNI           GCATC                    5/9
HgaI            GACGC                    5/10
BbvI            GCAGC                    8/12

Table 1: A sample of type IIs restriction endonucleases
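
The cutting site column can be related to the sticky-end it
produces with a short calculation. The sketch below assumes that
the two numbers give the distance in nucleotides from the
recognition sequence to the cut on the top and bottom strands
respectively; the function name is illustrative, and only the FokI,
SfaNI and HgaI entries of Table 1 are used.

```python
def sticky_end(top_cut, bottom_cut):
    """Return (overhang length, overhang polarity, number of possible ends)."""
    length = abs(bottom_cut - top_cut)
    if bottom_cut > top_cut:
        polarity = "5'"
    elif bottom_cut < top_cut:
        polarity = "3'"
    else:
        polarity = "blunt"
    # Each overhang position can be any of the 4 bases.
    return length, polarity, 4 ** length


cut_sites = {"FokI": (9, 13), "SfaNI": (5, 9), "HgaI": (5, 10)}
for enzyme, (top, bottom) in cut_sites.items():
    print(enzyme, sticky_end(top, bottom))
```

Under this reading a 4 base overhang gives the 256 ambiguous
sticky-ends used in the sorting step.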

The requirement of the process is the generation of ambiguous
sticky-ends at the termini of the nucleic acids being analysed.
This could also be achieved by controlled use of 5' to 3'
exonucleases. Clearly any method that achieves the creation of
such sticky-ends will suffice for the process.

Similarly, ordinary type II restriction endonucleases required by
this invention can be found in the reference sources listed
above. Details on methylation sensitivity and other means of
controlling enzyme action can be found in the references given
in REBase or can be acquired from the manufacturers.



Solid Phase Supports: 

A full discussion of solid phase supports can be found in Brenner
PCT/US95/12678 pg 12-14. This is an important issue in the use
of fluorimetry to determine sequence abundance in that the design
of supports will affect the acquisition of fluorescent signals,
which must be maximised for this process to be effective.

Mass Spectrometry of labels on oligonucleotides: 

Electrospray mass spectrometry is the preferred technique for
identification of labels attached to oligonucleotides since it
is a very soft technique and can be directly coupled to the
liquid phase molecular biology used in this invention. For a full
discussion of mass spectrometry techniques see:

• R.A.W. Johnstone and M.E. Rose, "Mass Spectrometry for
chemists and biochemists", 2nd edition, Cambridge University
Press, 1996.



Mass labels:

For any practically or commercially useful system it is important
that construction of labels be as simple as possible, using as few
reagents and processing steps as possible. A combinatorial
approach in which a series of monomeric molecular units are
available to be used in multiple combinations with each other

Amino acids:

With a small number of amino acids such as glycine, alanine and
leucine, a large number of small peptides with different masses
can be generated using standard peptide synthesis techniques well
known in the art. With more amino acids many more labels can be
synthesised.

• E. Atherton and R.C. Sheppard, editors, 'Solid Phase Peptide
Synthesis: A Practical Approach', IRL Press, Oxford.
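
The number of distinguishable masses available from a few amino
acids can be estimated with a short enumeration. The sketch below
is illustrative only: it uses standard monoisotopic residue masses
for glycine, alanine and leucine, treats each label as a linear
peptide (sum of residues plus one water), and groups compositions
that happen to share a mass. The function name is an assumption.

```python
from itertools import combinations_with_replacement

# Monoisotopic residue masses in daltons.
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "L": 113.08406}
WATER = 18.01056


def peptide_label_masses(max_length, residues=RESIDUE_MASS):
    """Map each distinct peptide mass to the compositions that give it."""
    masses = {}
    for length in range(1, max_length + 1):
        # Order of residues does not change the mass, so use compositions.
        for composition in combinations_with_replacement(sorted(residues), length):
            mass = round(sum(residues[r] for r in composition) + WATER, 4)
            masses.setdefault(mass, []).append("".join(composition))
    return masses


labels = peptide_label_masses(4)
print(len(labels), "distinct masses from peptides of up to 4 residues")
```

Because mass depends only on composition, sequence isomers and some
different compositions coincide, which limits how many labels a
given set of monomers can provide.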



Carbohydrates:

Similarly carbohydrate molecules are useful monomeric units that
can be synthesised into heteropolymers of differing masses but
these are not especially amenable to ESMS.

• Gait, M.J. editor, 'Oligonucleotide Synthesis: A Practical
Approach', IRL Press, Oxford, 1990

• Eckstein, editor, 'Oligonucleotides and Analogues: A Practical
Approach', IRL Press, Oxford, 1991

Other labelling chemistries:

Clearly almost any molecule can be tacked onto another as a 
label. Obviously the properties of such labels in the mass- 
spectrometer will vary. In terms of analysing biomolecules it 
will be important that the labels be inert, etc, as discussed 
previously. Cholesterol groups and glyceryl groups are 
possibilities that could be used but these are intrinsically 
relatively large molecules and the scope. 

Designing molecules with favourable mass spectrometry properties:

One can synthesise labels using standard organic chemistry
techniques. Such labels ought to carry amine derivatives,
quaternary ammonium ions or positive sulphur centres if positive
ions are sought. These have extremely good detection properties
that generate clean sharp signals. Similarly, negatively charged
ions can be used, so molecules with carboxylate moieties can be
used. Labels for MALDI mass spectrometry can be generated by
derivatising known molecules that are excitable by UV laser
light, such as sinapinic acid or cinnamic acid, of which a
number of derivatives are already commercially available. For a
text on organic chemistry see:

• Vogel's "Textbook of Organic Chemistry", 4th Edition, Revised
by B.S. Furniss, A.J. Hannaford, V. Rogers, P.W.G. Smith & A.R.
Tatchell, Longman, 1978.

Linkers:

An important feature of this invention is attachment of labels
to their relevant biomolecules and in the 5' to 3' sequencing
embodiment, the need for removable blocking groups is also
critical. For details on these issues see:

• Theodora W. Greene, "Protective Groups in Organic Synthesis",
1981, Wiley-Interscience



Fluorimetry:

Certain embodiments of the process could use oligonucleotides
bearing fluorescent labels. Detection of fluorescent signals can
be performed using optical equipment that is readily available.
Fluorescent labels usually have optimum frequencies for
excitation and then fluoresce at specific wavelengths in
returning from an excited state to a ground state. Excitation can
be performed with lasers at specific frequencies and fluorescence
detected using collection lenses, beam splitters and signal
distribution optics. These direct fluorescent signals to
photomultiplier systems which convert optical signals to
electronic signals which can be interpreted using appropriate
electronics systems.

Brenner PCT/US95/12678 pg 26-28 gives a full discussion.



Liquid Handling Robotics: 

For this process to be practically useful, automation is 
essential and liquid handling robots can be acquired from various 
sources such as Applied Biosystems. 



Example: Sequencing by the ligation of 4-mers

An experiment was carried out involving the extension of a
sequencing primer, hybridised to a single stranded DNA template,
by the stepwise ligation of 4mers. In general the 4mers will
contain labels from which the sequence of the 4mer and hence the
template can be derived. 256 4mers are required to cover all
possible variations. Each 4mer must contain a blocking group,
preferably the identifying label, at the 3' hydroxyl to ensure
that only one 4mer is ligated to the sequencing primer with each
cycle. After successful ligation the blocking group (and the
label if different) is removed by chemical or other means to
expose the 3' hydroxyl of the 4mer. The label, and hence the
sequence, is then identified. The 4mer is then available for the
ligation of the next 4mer in the second cycle.

In order to demonstrate the effect of removing the 3' blocking
group of the ligated 4mer, a non-blocked 4mer was ligated to the
sequencing primer in a separate reaction and this was then used
as a template for the next cycle. The experiment is depicted in
schematic form below:

Sequencing template - captured to streptavidin coated plate via
a biotin molecule (B):

5'CTGGTACGTACATACGACTA3'OH
3'GACCATGCATGTATGCTGATACAGATGAATGTATTTGATAGTCCTAGCTAAAG5'B

Cycle 1

5'CTGGTACGTACATACGACTA3'OH
3'GACCATGCATGTATGCTGATACAGATGAATGTATTTGATAGTCCTAGCTAAAG5'B

5'PO4-TGTC-3'FAM, 5'PO4-TACT-3'FAM, 5'PO4-TAAA-3'FAM

5'CTGGTACGTACATACGACTA-TGTC-FAM
3'GACCATGCATGTATGCTGAT-ACAGATGAATGTATTTGATAGTCCTAGCTAAAG5'B

Only 5'PO4-TGTC-3'FAM ligates to give a signal which identifies
the first 4 bases (3'ACAG5') of the template.

To simulate the deprotection of the above species the following
reaction was carried out:

5'CTGGTACGTACATACGACTA3'OH
3'GACCATGCATGTATGCTGATACAGATGAATGTATTTGAT(N)14-5'B
+
5'PO4-TGTC-3'OH

5'CTGGTACGTACATACGACTA-TGTC-3'OH
3'GACCATGCATGTATGCTGAT-ACAGATGAATGTATTTGAT(N)14-5'B

The above species was then used as a template for Cycle 2.

Cycle 2

5'CTGGTACGTACATACGACTA-TGTC-3'OH
3'GACCATGCATGTATGCTGAT-ACAGATGAATGTATTTGAT(N)14-5'B

5'PO4-TGTC-3'FAM, 5'PO4-TACT-3'FAM, 5'PO4-TAAA-3'FAM

5'CTGGTACGTACATACGACTA-TGTC-TACT-FAM
3'GACCATGCATGTATGCTGAT-ACAG-ATGAATGTATTTGAT(N)14-5'B

Only 5'PO4-TACT-3'FAM ligates to give a signal which identifies
the next 4 bases (3'ATGA5') of the template.

Also, to simulate the deprotection of the above species the
following reaction was carried out:

5'CTGGTACGTACATACGACTA-TGTC-3'OH
3'GACCATGCATGTATGCTGAT-ACAGATGAATGTATTTGAT(N)14-5'B
+
5'PO4-TACT-3'OH

5'CTGGTACGTACATACGACTA-TGTC-TACT-OH
3'GACCATGCATGTATGCTGAT-ACAG-ATGAATGTATTTGAT(N)14-5'B

The above species was then used as a template for Cycle 3.

Cycle 3

5'CTGGTACGTACATACGACTA-TGTC-TACT-OH
3'GACCATGCATGTATGCTGAT-ACAG-ATGAATGTATTTGAT(N)14-5'B
+
5'PO4-TGTC-3'FAM, 5'PO4-TACA-3'FAM, 5'PO4-TAAA-3'FAM

5'CTGGTACGTACATACGACTA-TGTC-TACT-TACA-FAM
3'GACCATGCATGTATGCTGAT-ACAG-ATGA-ATGTATTTGAT(N)14-5'B

Only 5'PO4-TACA-3'FAM ligates to give a signal which identifies
the next 4 bases (3'ATGT5') of the template.

Therefore, through 3 cycles of ligation of 4mers the sequence
ACAGATGAATGT of the template was deduced.
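
The deduction of the template from the ligated 4mers is a
base-by-base complement taken in ligation order. A minimal sketch,
with an illustrative function name, is:

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def template_from_probes(ligated_probes):
    """Deduce the template, read 3'->5', from 4mers written 5'->3'.

    ligated_probes: the 4mers that gave a specific signal, in cycle order.
    """
    return "".join(probe.translate(COMPLEMENT) for probe in ligated_probes)


# The three 4mers that ligated in Cycles 1 to 3 of the schematic above.
print(template_from_probes(["TGTC", "TACT", "TACA"]))
```

Because the extended strand is written 5' to 3' and the template
3' to 5', no reversal is needed; each base is simply replaced by
its partner.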

Materials:

Oligonucleotides:

sequencing primer

5'CTGGTACGTACATACGACTA3'OH

sequencing template (contains a 5' biotin molecule)

3'GACCATGCATGTATGCTGATACAGATGAATGTATTTGATAGTCCTAGCTAAAG5'B

4mers used

5'PO4-TGTC-3'FAM, 5'PO4-TACT-3'FAM, 5'PO4-TAAA-3'FAM,
5'PO4-TACA-3'FAM, 5'PO4-TGTC-3'OH, 5'PO4-TACT-3'OH

All oligos were synthesised by Oswel DNA (UK).

Solutions:

wash solution       50mM Tris-HCl pH7.6
                    10mM MgCl2

binding solution    50mM Tris-HCl pH7.6
                    10mM MgCl2
                    1M NaCl

ligase buffer       50mM Tris-HCl pH7.6
                    10mM MgCl2
                    10mM DTT
                    1mM ATP
                    50ug/ml BSA

Methods:

Hybridisation of the sequencing primer to the template

Aliquots of 500ul of 0.5 times binding solution containing
5pmol/ul of each of the sequencing primer and template were
heated at 95°C for 5 mins and then allowed to cool to room
temperature over 2 hours. They were then incubated at 4°C for
1 hour and frozen at -20°C until used.

This will now be referred to as 'the sequencing template'.

Capture of the Sequencing Template

20pmol (4ul) of the sequencing template plus 21ul of binding
solution were added to each well of a black streptavidin coated
96 well microtitre plate (Boehringer Mannheim) and incubated at
room temperature for 1 hour. The wells were then washed twice
with 200ul of wash solution and once with 50ul of ligase buffer.
The plates were then stored at 4°C until used.

Cycle 1

Three groups of reactions, one group with a specific 4mer (TGTC)
and two with non-specific 4mers (TACT and TAAA), were set up as
follows.

Four reactions were set up containing 5% PEG, 400 units of ligase
(New England Biolabs) and 100 pmol of 4mer in 25 ul of ligase
buffer for each of the following 4mers: 5'PO4-TGTC-3'FAM,
5'PO4-TACT-3'FAM and 5'PO4-TAAA-3'FAM. Also four reactions for
the same 4mers were set up in the same way, but without the
inclusion of the ligase, to control for non-specific binding of
the 4mers.

To simulate a deprotected 4mer successfully ligated to the
sequencing template, 48 reactions containing 5% PEG, 400 units of
ligase (New England Biolabs) and 100 pmol of 5'PO4-TGTC-3'OH in
25 ul of ligase buffer were set up.

The above reactions were then added to wells of the microtitre
plate containing the sequencing template and incubated at 4°C for
30 minutes followed by 16°C for 1 hour. The wells were then
washed 3 times with 100ul of wash solution. 100ul of wash
solution was added to each well. The amount of 4mer ligated to
the sequencing template was assessed by measuring the fluorescence
of any FAM molecule present using a Biolumin 960 fluorescent
microtitre plate reader (Molecular Dynamics) using the Xperiment
1.1.0 software (Molecular Dynamics).



Data for Cycle 1

The following data for Cycle 1 are expressed as relative
fluorescent units (RFUs) obtained from the reactions which
contained ligase:

        TAAA-FAM   TACT-FAM   TGTC-FAM
        10764      10878      120119
        9815       9994       97638
        11635      12543      98891
        12031      11188      95931
mean    11069      11151      103145

The following data from Cycle 1 are expressed as relative
fluorescent units (RFUs) obtained from the reactions which did
not contain ligase:

        TAAA-FAM   TACT-FAM   TGTC-FAM
        14605      13987      15134
        13638      13692      15370
        13938      14823      16019
        13826      13117      17849
mean    14002      13905      16093
These data clearly show that TGTC-FAM has been specifically
ligated to the sequencing template. The other 4mers, in the
presence of ligase, gave signals similar to those obtained from
non-specific hybridisation control reactions.

WO 98/31831 PCT/GB98/00130

Therefore, this specific signal provides the first 4 bases
(3'ACAG5') of the sequencing template.
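
The size of the specific signal relative to the controls can be
summarised as a ratio of means. The sketch below uses the TACT-FAM
and TGTC-FAM readings reported for Cycle 1; the function name and
the choice of a simple fold ratio are illustrative assumptions and
are not part of the reported analysis.

```python
from statistics import mean

# Cycle 1 readings in relative fluorescent units (RFUs), as reported above.
with_ligase = {
    "TACT": [10878, 9994, 12543, 11188],
    "TGTC": [120119, 97638, 98891, 95931],
}
without_ligase = {
    "TACT": [13987, 13692, 14823, 13117],
    "TGTC": [15134, 15370, 16019, 17849],
}


def fold_over(signal, background):
    """Ratio of the mean signal to the mean background reading."""
    return mean(signal) / mean(background)


# Specific 4mer against a mismatched 4mer, both with ligase.
print(fold_over(with_ligase["TGTC"], with_ligase["TACT"]))
# Specific 4mer with ligase against the same 4mer without ligase.
print(fold_over(with_ligase["TGTC"], without_ligase["TGTC"]))
```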

Cycle 2

The following reactions were applied to the sequencing template
to which the specific 5'PO4-TGTC-3'OH 4mer had been ligated (as
described in Cycle 1) to mimic a 4mer which had been
deprotected/identified.

Three groups of reactions, one group with a specific 4mer (TACT)
and two with non-specific 4mers (TGTC and TAAA), were set up as
follows.

Four reactions were set up containing 5% PEG, 400 units of ligase
(New England Biolabs) and 100 pmol of 4mer in 25 ul of ligase
buffer for each of the following 4mers: 5'PO4-TGTC-3'FAM,
5'PO4-TACT-3'FAM and 5'PO4-TAAA-3'FAM. Also four reactions for
the same 4mers were set up in the same way but without the
inclusion of the ligase, to control for non-specific binding of
the 4mers.

To simulate a deprotected 4mer successfully ligated to the
sequencing template, 24 reactions containing 5% PEG, 400 units of
ligase (New England Biolabs) and 100 pmol of 5'PO4-TACT-3'OH in
25 ul of ligase buffer were set up.

The above reactions were then added to wells of the microtitre
plate containing the sequencing template, with 5'PO4-TGTC-3'OH
ligated to it as described in Cycle 1, and incubated at 4°C for
30 minutes followed by 16°C for 1 hour. The wells were then
washed 3 times with 100ul of wash solution. 100ul of wash
solution was added to each well. The amount of 4mer ligated to the
sequencing template was assessed by measuring the fluorescence of
any FAM molecule present using a Biolumin 960 fluorescent
microtitre plate reader (Molecular Dynamics) using the Xperiment
1.1.0 software (Molecular Dynamics).



Data for Cycle 2

The following data for Cycle 2 are expressed as relative
fluorescent units (RFUs) obtained from the reactions which
contained ligase:

        TAAA-FAM   TACT-FAM   TGTC-FAM
        9238       24071      9693
        8207       24455      9415
        10312      23194      11071
        9153       21641      [illegible]
mean    9227       23340      10248

The following data from Cycle 2 are expressed as relative
fluorescent units (RFUs) obtained from the reactions which did
not contain ligase:

        TAAA-FAM   TACT-FAM   TGTC-FAM
        12532      16025      13917
        11947      15651      13573
        12040      17587      13049
        11908      16464      12998
mean    12107      16432      13384

As with Cycle 1, Cycle 2 produces a clear signal from the 
specific 4mer ligation as compared to the non-specific 4mer 
ligations and the non-specific hybridisation control reactions 
which lacked ligase. 

Therefore Cycle 2 has produced the next 4 bases (3'ATGA5') of the
sequencing template.

Cycle 3

The following reactions were applied to the sequencing template to
which the specific 5'PO4-TACT-3'OH 4mer had been ligated (as
described in Cycle 2) to mimic a 4mer which had been
deprotected/identified.

Three groups of reactions, one group with a specific 4mer (TACA)
and two with non-specific 4mers (TGTC and TAAA), were set up as
follows.

Four reactions were set up containing 5% PEG, 400 units of ligase
(New England Biolabs) and 100 pmol of 4mer in 25 ul of ligase
buffer for each of the following 4mers: 5'PO4-TGTC-3'FAM,
5'PO4-TACA-3'FAM and 5'PO4-TAAA-3'FAM. Also four reactions for
the same 4mers were set up in the same way, but without the
inclusion of the ligase, to control for non-specific binding of
the 4mers.

The above reactions were then added to wells of the microtitre
plate containing the sequencing template, with 5'PO4-TACT-3'OH
ligated to it as described in Cycle 2, and incubated at 4°C for
30 minutes followed by 16°C for 1 hour. The wells were then
washed 3 times with 100ul of wash solution. 100ul of wash
solution was added to each well. The amount of 4mer ligated to the
sequencing template was assessed by measuring the fluorescence of
any FAM molecule present using a Biolumin 960 fluorescent
microtitre plate reader (Molecular Dynamics) using the Xperiment
1.1.0 software (Molecular Dynamics).



Data for Cycle 3

The following data for Cycle 3 are expressed as relative
fluorescent units (RFUs) obtained from the reactions which
contained ligase:

        TAAA-FAM   TACA-FAM   TGTC-FAM
        8294       61002      10307
        8136       52253      9659
        10323      53848      11894
        9424       51570      12443
mean    9044       54668      11076

The following data from Cycle 3 are expressed as relative
fluorescent units (RFUs) obtained from the reactions which did
not contain ligase:

        TAAA-FAM   TACA-FAM   TGTC-FAM
        11605      16641      14000
        11417      15414      14704
        11995      17719      14443
        11959      16021      14381
mean    11744      16449      14382

As with Cycles 1 and 2, Cycle 3 produces a clear signal from the 
specific 4mer ligation as compared to the non-specific 4mer 
ligations and the non-specific hybridisation control reactions 
which lacked ligase. 

Therefore Cycle 3 produced the next 4 bases (3'ATGT5') of the
sequencing template.

A total of 12 bases (3'ACAGATGAATGT5') were successfully
sequenced by three rounds of ligations using a fluorescent system
which does not require the use of gel electrophoresis.

The specific 4mers (e.g. TACA in Cycle 3) generally give a
slightly higher reading than the non-specific 4mers in the
non-specific hybridisation reactions, without ligase.
This is because they hybridise to their specific target on the
sequencing template and are not fully removed in the washing
steps. These slightly higher signals could be reduced to the
levels of the non-specific 4mers by increasing the stringency of
the washing steps, i.e. by lowering the ionic strength or
increasing the temperature of the wash solution.

The signals obtained for the mismatched 4mers in the reactions
with ligase are lower than those obtained from the non-specific
hybridisation control reactions. This is probably due to a
substance in the ligase solution, not removed by the washing
steps, quenching some of the fluorescence. This difference could
be removed by improving the washing steps or by including
inactivated ligase solution in the control reactions, thereby
ensuring the same amount of quenching in all reactions.

To ensure that the maximum possible number of cycles may be 
carried out using this method, it is important to ensure that the 
ligation efficiency is very high for each step, so that 
sufficient template is produced for the next cycle. 
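
The dependence on ligation efficiency can be made concrete with a simple model. The sketch below assumes, purely for illustration, that every cycle ligates the same fraction of the remaining extendable template and that unligated template is lost to later cycles; the source does not state this model, and the function names and example values are hypothetical.

```python
def template_fraction(efficiency, cycles):
    """Fraction of the starting template still correctly extended after
    `cycles` ligation cycles, assuming a constant per-cycle efficiency."""
    return efficiency ** cycles


def max_cycles(efficiency, min_fraction):
    """Largest number of cycles for which at least `min_fraction` of the
    starting template remains available (same constant-efficiency model)."""
    if not 0 < efficiency < 1:
        raise ValueError("efficiency must be strictly between 0 and 1")
    if not 0 < min_fraction <= 1:
        raise ValueError("min_fraction must be in (0, 1]")
    cycles = 0
    fraction = 1.0
    # Keep adding cycles while the next one would still leave enough template.
    while fraction * efficiency >= min_fraction:
        fraction *= efficiency
        cycles += 1
    return cycles


if __name__ == "__main__":
    for eff in (0.80, 0.90, 0.99):
        print(eff, max_cycles(eff, 0.5))
```

Under this model, raising the per-cycle efficiency from 90% to 99% increases the number of cycles that keep at least half the template from 6 to 68.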






CLAIMS:

1. A method for sequencing DNA, which comprises:

(a) obtaining a target DNA population comprising a
plurality of single-stranded DNAs to be sequenced, each of which
is present in a unique amount in the same reaction zone and bears
a primer to provide a double-stranded portion of the DNA for
ligation thereto;

(b) contacting the DNA population with an array of
hybridisation probes, each probe comprising a label cleavably
attached to a known base sequence of predetermined length, the
array containing all possible base sequences of that
predetermined length and the base sequences being incapable of
ligation to each other, wherein the contacting is carried out
in the presence of ligase under conditions to ligate to the
double-stranded portion of each DNA the probe bearing the base
sequence complementary to the single-stranded DNA adjacent the
double-stranded portion thereby to form an extended
double-stranded portion which is incapable of ligation to further
probes; and

(c) removing all unligated probes; followed by the steps of:

(d) cleaving the ligated probes to release each label;

(e) recording the quantity of each label; and

(f) activating the extended double-stranded portion to
enable ligation thereto; wherein

(g) steps (b) to (f) are repeated in a cycle for a
sufficient number of times to determine the sequence of each
single-stranded DNA by determining the sequence of release of
each label.

2. A method according to claim 1, wherein the array comprises
a plurality of sub-arrays which together contain all the possible
base sequences, and wherein each sub-array is contacted with the
DNA population according to step (b), unligated probes are
removed according to step (c), and these steps are repeated in
a cycle before step (d) so that all of the sub-arrays contact the
DNA population.



3. A method according to claim 1 or claim 2, wherein the target 
DNA population is obtained by sorting an initial DNA sample into
sub-populations and selecting one of the sub-populations as the 
target DNA population. 

4. A method according to claim 3, wherein the initial DNA 
sample is cut into fragments, each having a sticky end of known 
length and unknown sequence, which fragments are sorted into sub- 
populations according to their sticky end sequence. 

5. A method according to any one of the preceding claims, 
wherein each single-stranded DNA is immobilised at one end.

6. A method according to any one of the preceding claims,
wherein the label of each probe comprises a mass label, and the
quantity of each label is recorded according to step (e) using
mass spectrometry after release of the label in step (d).

7. A method according to any one of the preceding claims,
wherein the known base sequence is blocked at its 3'-OH.



8. A method according to claim 7, wherein the step (d) of
cleaving the ligated probes to release each label unblocks the
3'-OH of the extended double-stranded portion according to step
(f).



WO 98/31831 PCT/GB98/00130 


9. A method according to claim 8, wherein the label of each
probe is cleavably attached to the 3'-OH of the base sequence.

10. A method according to any one of claims 1 to 6, wherein the
base sequence of each probe is unphosphorylated at both 3' and
5' ends and step (f) comprises phosphorylating the 5'-OH of the
extended double-stranded portion.

11. A method according to any one of the preceding claims,
wherein the predetermined length of the base sequence is from 2
to 6.

12. A method according to claim 11, wherein the predetermined
length of the base sequence is 4.

13. A kit for sequencing DNA, which comprises an array of
hybridisation probes, each probe comprising a label cleavably
attached to a known base sequence of predetermined length, the
array containing all possible base sequences of that
predetermined length and the base sequences being incapable of
ligating to each other.

14. A kit according to claim 13, wherein the known base sequence
is blocked at its 3'-OH.

15. A kit according to claim 14, wherein the label of each probe
is cleavably attached to the 3'-OH of the base sequence to
prevent ligation thereto.

16. A kit according to any one of claims 13 to 15, wherein the
base sequence of each probe is unphosphorylated at both 3' and
5' ends.

17. A kit according to any one of claims 13 to 16, wherein the
label of each probe comprises a mass label.

18. A kit according to any one of claims 13 to 17, wherein the
predetermined length of the base sequence is from 2 to 6.

19. A kit according to claim 18, wherein the predetermined
length of the base sequence is 4.

20. Use of a kit according to any one of claims 13 to 19 for a
method of sequencing DNA.
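
The probe array recited in the claims above, containing all possible base sequences of a predetermined length, and its division into sub-arrays can be enumerated directly. The following is a minimal sketch for illustration only; the function names and the choice of 64 probes per sub-array are assumptions, not taken from the claims.

```python
from itertools import product

BASES = "ACGT"


def all_probes(length):
    """Every possible base sequence of the given length (4**length in total)."""
    return ["".join(bases) for bases in product(BASES, repeat=length)]


def sub_arrays(probes, size):
    """Split a probe array into consecutive sub-arrays of at most `size`
    probes, which together still contain every probe exactly once."""
    return [probes[i:i + size] for i in range(0, len(probes), size)]


if __name__ == "__main__":
    probes = all_probes(4)
    print(len(probes))                    # 256 possible 4-mers
    groups = sub_arrays(probes, 64)       # 64 is an arbitrary example size
    print(len(groups), [len(g) for g in groups])
```

For the claimed range of lengths 2 to 6, the array size grows from 16 to 4096 probes, which is one practical reason to contact the DNA population with sub-arrays in turn.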
FIG. 1A

CYCLE 1:

[Diagram: template strand ...TAGTAGGT with PRIMER and BEAD; the 4-mer TCCA is ligated opposite AGGT, leaving HO-TCCA after cleavage.]

DNA PREPARED BY VARIOUS MEANS FROM ANY SOURCE

STEP 1: ADD KINASE TO PHOSPHORYLATE 5' TERMINUS OF PRIMER, ACTIVATING
THIS TERMINUS FOR EXTENSION BY LIGATION OF 4-MER PROBES.

STEP 2: ADD ALL 256 POSSIBLE 4-MERS IN THE PRESENCE OF A LIGASE TO EXTEND
DOUBLE-STRANDED PRIMER REGION BY 4 BASE PAIRS. THE LIGASE ACTS TO
ENSURE THE FIDELITY OF THE EXTENSION.

STEP 3: WASH IN LOW STRINGENCY BUFFER OR RAISE TEMPERATURE OF MEDIUM
TO REMOVE ANY UNLIGATED 4-MERS.

STEP 4: CLEAVE MASS-LABEL BY PHOTOLYSIS OR CHEMICAL CLEAVAGE AND REMOVE
SUPERNATANT FOR ANALYSIS IN MASS SPECTROMETER. CLEAVAGE LEAVES
3'-OH AVAILABLE FOR FURTHER EXTENSION.

SUBSTITUTE SHEET (rule 26)
FIG. 1B

CYCLE 2:

[Diagram: template strand ...TAGTAGGT with PRIMER and BEAD, TCCA already ligated; the 4-mer ATCA is ligated next, giving HO-ATCA TCCA opposite TAGTAGGT.]

STEP 1: ADD KINASE TO PHOSPHORYLATE 5' TERMINUS OF PRIMER, ACTIVATING
THIS TERMINUS FOR EXTENSION BY LIGATION OF 4-MER PROBES.

STEP 2: ADD ALL 256 POSSIBLE 4-MERS IN THE PRESENCE OF A LIGASE TO EXTEND
DOUBLE-STRANDED PRIMER REGION BY 4 BASE PAIRS. THE LIGASE ACTS TO
ENSURE THE FIDELITY OF THE EXTENSION.

STEP 3: WASH IN LOW STRINGENCY BUFFER OR RAISE TEMPERATURE OF MEDIUM
TO REMOVE ANY UNLIGATED 4-MERS.

STEP 4: CLEAVE MASS-LABEL BY PHOTOLYSIS OR CHEMICAL CLEAVAGE AND REMOVE
SUPERNATANT FOR ANALYSIS IN MASS SPECTROMETER.

CLEARLY THIS PROCESS CAN BE ITERATED UNTIL THE ENTIRE NUCLEIC
ACID HAS BEEN SEQUENCED.
FIG. 2A

CYCLE 1:

[Diagram: template strand ...TAGTAGGT with PRIMER (HO- terminus) and BEAD; the 4-mer TCCA, shown bearing ppp, is ligated opposite AGGT, leaving HO-TCCA after cleavage.]

DNA PREPARED BY VARIOUS MEANS FROM ANY SOURCE

STEP 1: ADD ALL 256 POSSIBLE 4-MERS IN THE PRESENCE OF A LIGASE, TO EXTEND
DOUBLE-STRANDED PRIMER REGION BY 4 BASE PAIRS. THE LIGASE ACTS TO
ENSURE THE FIDELITY OF THE EXTENSION.

STEP 2: WASH IN LOW STRINGENCY BUFFER OR RAISE TEMPERATURE OF MEDIUM
TO REMOVE ANY UNLIGATED 4-MERS.

STEP 3: CLEAVE MASS-LABEL BY PHOTOLYSIS OR CHEMICAL CLEAVAGE AND REMOVE
SUPERNATANT FOR ANALYSIS IN MASS SPECTROMETER. CLEAVAGE LEAVES
3'-OH AVAILABLE FOR FURTHER EXTENSION.
FIG. 2B

CYCLE 2:

[Diagram: template strand ...TAGTAGGT with PRIMER and BEAD, HO-TCCA already ligated; the 4-mer ATCA, shown bearing ppp, is ligated next, giving HO-ATCA TCCA opposite TAGTAGGT.]

STEP 1: ADD ALL 256 POSSIBLE 4-MERS IN THE PRESENCE OF A LIGASE, TO EXTEND
DOUBLE-STRANDED PRIMER REGION BY 4 BASE PAIRS. THE LIGASE ACTS TO
ENSURE THE FIDELITY OF THE EXTENSION.

STEP 2: WASH IN LOW STRINGENCY BUFFER OR RAISE TEMPERATURE OF MEDIUM
TO REMOVE ANY UNLIGATED 4-MERS.

STEP 3: CLEAVE MASS-LABEL BY PHOTOLYSIS OR CHEMICAL CLEAVAGE AND REMOVE
SUPERNATANT FOR ANALYSIS IN MASS SPECTROMETER.

CLEARLY THIS PROCESS CAN BE ITERATED UNTIL THE ENTIRE NUCLEIC
ACID HAS BEEN SEQUENCED.
FIG. 3

ADAPTORS TO GENERATE DISTINCT TERMINI IN GENERIC NUCLEIC ACIDS

ADAPTOR 1 (BIOTIN):

    NNNGGCC
    NNNCCGG

ADAPTOR 2 (BspMI RECOGNITION SEQUENCE):

    NNNGCAGGTNNNN
    NNNCGTCCANNNN

THIS ADAPTOR WOULD ALLOW FRAGMENTS TO BE SORTED AFTER
DETERMINING A TERMINUS.

ALTERNATIVE ADAPTOR 2 (BsaI RECOGNITION SEQUENCE):

    NNNGGTCTCNNNN
    NNNCCAGAGNNNN

THIS ADAPTOR WOULD ALLOW FRAGMENTS TO BE REIMMOBILISED
ON BEADS USING THE KNOWN BsaI STICKY-END. THIS WOULD ALLOW
FURTHER PROCESSING BEFORE BEGINNING TO SEQUENCE.
FIG. 4

DIAGRAM TO SHOW PREPARATION OF SINGLE-STRANDED DNA FOR PRIMER
EXTENSION SEQUENCING.

DNA CLEAVED WITH A RESTRICTION ENZYME LIKE BamHI
[Diagram: fragments with GATC/CTAG sticky ends bearing terminal phosphates (PPP).]

ADD BEADS BEARING A DOUBLE-STRANDED OLIGONUCLEOTIDE WITH A STICKY-END
COMPLEMENTARY TO RESTRICTION ENZYME STICKY-ENDS BUT WITH NO
TERMINAL PHOSPHATE GROUPS
[Diagram: fragment annealed and ligated to the bead-bound PRIMER oligonucleotide (HO- termini); NICK REMAINS.]

THERMALLY DENATURE STRANDS SO NICKED STRAND IS RELEASED. WASH THE
FREE DNA AWAY.

SINGLE-STRANDED PRIMED DNA REMAINS
[Diagram: single strand attached via PRIMER to the BEAD.]
application(s) designating the United States of America that is/are listed below and, insofar as the subject matter of each of the
claims of this application is not disclosed in that/those prior application(s) in the manner provided by the first paragraph of Title 
35, United States Code, §112, I acknowledge the duty to disclose to the Office all information known to me to be material to the 
patentability as defined in Title 37, Code of Federal Regulations §1.56, which became available between the filing date of the prior 
application(s) and the national or PCT international filing date of this application: 



PRIOR U.S. APPLICATIONS OR PCT INTERNATIONAL APPLICATIONS DESIGNATING THE U.S. FOR BENEFIT UNDER 35 U.S.C. 120:

U.S. APPLICATIONS
U.S. APPLICATION NUMBER | U.S. FILING DATE | STATUS (check one): PATENTED, PENDING, ABANDONED

PCT APPLICATIONS DESIGNATING THE U.S.
PCT APPLICATION NO. | PCT FILING DATE | U.S. APPLICATION NUMBERS ASSIGNED (if any)

I hereby appoint the following attorneys and agent(s) to prosecute said application and to transact all business in the Patent and
Trademark Office connected therewith and to file, prosecute and to transact all business in connection with international applications
directed to said invention:



Name                          Reg. No.
William L. Mathis
R. Danny Huntington           [illegible]
Gerald F. Swiss               30,113
Robert S. Swecker
Eric H. Weisblatt             30,505
Michael J. Ure                33,089
Platon N. Mandros             22,124
James W. Peterson             26,057
Charles F. Wieland III        33,096
Benton S. Duffett, Jr.
Teresa Stanek Rea             30,427
Bruce T. Wieder               [illegible]
Norman H. Stepno              22,716
Robert E. Krebs               25,885
Todd R. Walters               [illegible]
Ronald L. Grudziecki          [illegible]
William C. Rowland            [illegible]
Ronni S. Jillions             31,979
Frederick G. Michaud, Jr.     [illegible]
T. Gene Dillahunty            25,423
Harold R. Brown III           36,341
Alan E. Kopecki               [illegible]
Patrick C. Keane              32,858
Allen R. Baum                 36,086
Regis E. Slutter              [illegible]
Bruce J. Boggs, Jr.           32,344
Steven M. du Bois             35,023
Samuel C. Miller, III         [illegible]
William H. Benz
Brian P. O'Shaughnessy        32,747
Robert G. Mukai
Peter K. Skiff                31,917
George A. Hovanec, Jr.        28,223
Richard J. McGrath
James A. LaBarre              28,632
Matthew L. Schneider          32,814
E. Joseph Gess                [illegible]
Michael G. Savage             32,596

and: Robin L. Teskin, Reg. No. [illegible]

Address all correspondence to: Samuel C. Miller, III
Burns, Doane, Swecker & Mathis, L.L.P.
P.O. Box [illegible]
Alexandria, [illegible]

Address all telephone calls to: Samuel C. Miller, III at (703) 836-6620.

I hereby declare that all statements made herein of my own knowledge are true and that all statements made on information and
belief are believed to be true; and further that these statements were made with the knowledge that willful false statements and the
like so made are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States Code and that
such willful false statements may jeopardize the validity of the application or any patent issued thereon.
020600-280

FULL NAME OF SOLE OR FIRST INVENTOR: Gunter Schmidt
DATE: [illegible]
RESIDENCE: Houghton Manor, Houghton, Cambridge PE17 2BQ, United Kingdom
CITIZENSHIP: German
POST OFFICE ADDRESS: Houghton Manor, Houghton, Cambridge PE17 2BQ, United Kingdom

FULL NAME OF SECOND JOINT INVENTOR, IF ANY: Andrew Hugin Thompson
SIGNATURE: [illegible]
DATE: [illegible]
RESIDENCE: 25 Knoll Park, Alloway, Ayr KA7 4RH, United Kingdom
CITIZENSHIP: United Kingdom
POST OFFICE ADDRESS: 25 Knoll Park, Alloway, Ayr KA7 4RH, United Kingdom

[The form fields for the third to ninth joint inventors (SIGNATURE, DATE, RESIDENCE, CITIZENSHIP, POST OFFICE ADDRESS) are blank.]